perm filename PRESID.2[W84,JMC]1 blob
sn#740338 filedate 1984-02-04 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00004 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 presid.2[w84,jmc] Second AAAI presidential message
C00005 00003 WE NEED BETTER (AND PERHAPS HIGHER) STANDARDS FOR RESEARCH IN AI
C00009 00004 We need paradigm problems also. It is important that they be
C00011 ENDMK
C⊗;
presid.2[w84,jmc] Second AAAI presidential message
WE NEED BETTER (AND PERHAPS HIGHER) STANDARDS FOR RESEARCH IN AI
One of the complaints about AI (both from outsiders and
insiders) is that it is difficult to tell from a paper exactly what has
been accomplished and how it advances the state of the art.
This complaint is sometimes posed in moral terms, e.g.
"These people are swindling the public".
Sometimes this has some truth in it, but, on the whole,
there is no reason to suppose that AI researchers are less
intelligent or less motivated to do good work than researchers
in other fields. Rather, the largest problem lies in the current scientific
state of the field.
To make the contrast, consider the situation in pure mathematics.
A recent article in the %2Wall Street Journal%1 is about William Thurston,
who got the Fields Medal for his work in topology. It mentions his first
paper, published when he was 17, which settled the xxx conjecture. The
conjecture was that three perfect billiard balls on an infinite plane
surface could undergo at most three collisions before they scooted off
to infinity. Thurston showed that four collisions were possible.
Mathematics has hundreds of such problems, easy to state and known
to be difficult enough that anyone who solves one of them earns
a certain amount of well-deserved instant reputation. One catch
is that while the problem is neat, it didn't arise from anyone's need
to bounce perfect billiard balls off one another. Still, it can't
be excluded that the problem will turn out to have some applied
significance - even in computer science.
WE NEED BETTER (AND PERHAPS HIGHER) STANDARDS FOR RESEARCH IN AI
If we had better standards for evaluating research results
in AI, the field would progress faster.
One of the complaints about AI (both from outsiders and insiders)
is that it is difficult to tell from a paper exactly what has been
accomplished and how it advances the state of the art. Some people
put the problem in moral terms and accuse others of trying to fool
the funding agencies and the public. However, there is no reason
to suppose that people in AI are less motivated than other scientists
to do good work. Indeed I have no information that the average quality
of work in AI is less than that in other fields. In my previous
message I grumbled about there being insufficient basic research,
but one of the reasons for this is the difficulty of evaluating
whether a piece of research has made basic progress.
I shall not concern myself in this article with vice,
but refer the reader to Drew McDermott's "Artificial Intelligence
and Natural Stupidity" in SIGART Newsletter xxx.
It seems that evaluation should be based on the kind of
advance the research purports to be. Here are some examples.
1. The research constitutes making the computer solve a
problem that hasn't previously been solved by computer. Let us
suppose that there are no theoretical arguments that the methods
are adequate for a class of problems but merely a program that
performs impressively on certain sample problems together with
some explanation of how the program works. The reader cannot
easily assure himself that the program is not overly
specialized to the particular example problems that were
used in developing it. It has often turned out that
other researchers have not been able to learn much from the
paper. Sometimes a topic is so intractable that this is the
best that can be done, but maybe this means that the topic
is too intractable for the present state of the art.
2. A better result occurs when a previously unidentified
intellectual mechanism is described and shown to be either
necessary or sufficient for some class of problems. An example is
the alpha-beta heuristic for game playing. Humans use it, but it
wasn't identified by the writers of the first chess programs. It
doesn't constitute a game-playing program by itself, but it seems clearly
necessary, because without it the number of positions that have to
be examined is sometimes the square of the number examined when it is used.
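
To make the mechanism concrete, here is a minimal sketch of the alpha-beta
computation in a present-day programming notation (Python). The names
alphabeta, children and evaluate are inventions of the sketch, standing in
for whatever move generator and static position evaluator a particular
program supplies; no existing chess program is being quoted.

    # A sketch of alpha-beta pruning over an abstract game tree.
    # children(position) yields the successor positions; evaluate(position)
    # gives a static score from the maximizing player's point of view.
    def alphabeta(position, depth, alpha, beta, maximizing, children, evaluate):
        moves = list(children(position))
        if depth == 0 or not moves:
            return evaluate(position)
        if maximizing:
            value = float("-inf")
            for child in moves:
                value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                             False, children, evaluate))
                alpha = max(alpha, value)
                if alpha >= beta:
                    break  # cutoff: the minimizing opponent will avoid this line
            return value
        else:
            value = float("inf")
            for child in moves:
                value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                             True, children, evaluate))
                beta = min(beta, value)
                if alpha >= beta:
                    break  # cutoff: the maximizing player will avoid this line
            return value

With favorable move ordering, a search to depth d over branching factor b
examines on the order of b^(d/2) positions instead of the b^d of plain
minimax, which is the sense in which the unpruned count is the square of
the pruned one.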
We need paradigm problems also. It is important that they be
well enough defined so that it would be definite who solved them.
The Fredkin committee award of $5,000 to Ken Thompson for the
first chess program to reach master rating had some of the
desiderata; at least it was definite what was achieved. Unfortunately,
what advance in AI, if any, made it possible is a
lot less definite.
Here is a challenge to those whose programs understand English.
A news story and some questions about it use the following
vocabulary, where the words have been sorted into alphabetical
order. Besides the words listed, there are also some proper
names that will be in the story, but like the human reader,
your program will have to recognize them when they appear.
Fire up your understander with the domain knowledge relevant
to a crime in Brooklyn. When you're ready, I'll feed you the
story itself and some questions about it, also in English, and
we can see how well the understander does. If you want to haggle
about the terms of the problem, please do it in letters to the
editor of AI Magazine.
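
For concreteness, here is a small sketch (in Python) of how such a sorted
vocabulary might be extracted from a story held as plain text. The function
name challenge_vocabulary and the crude capitalization test for proper names
are inventions of this sketch; the actual challenge materials would of
course be prepared by hand.

    import re

    def challenge_vocabulary(story_text):
        # Collect the story's words, lower-cased, de-duplicated and sorted.
        # Words that never appear in lower case are withheld, as a rough
        # stand-in for the proper names the challenge leaves to the reader.
        words = re.findall(r"[A-Za-z']+", story_text)
        lowercase_forms = {w for w in words if w[0].islower()}
        vocabulary = set()
        for w in words:
            if w[0].isupper() and w.lower() not in lowercase_forms:
                continue  # probably a proper name
            vocabulary.add(w.lower())
        return sorted(vocabulary)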